Communication-efficient Sparse Regression

Authors

  • Jason D. Lee
  • Qiang Liu
  • Yuekai Sun
  • Jonathan E. Taylor
Abstract

We devise a communication-efficient approach to distributed sparse regression in the high-dimensional setting. The key idea is to average “debiased” or “desparsified” lasso estimators. We show the approach converges at the same rate as the lasso as long as the dataset is not split across too many machines, and consistently estimates the support under weaker conditions than the lasso. On the computational side, we propose a new parallel and computationally-efficient algorithm to compute the approximate inverse covariance required in the debiasing approach, when the dataset is split across samples. We further extend the approach to generalized linear models.
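The averaging idea in the abstract can be sketched in a few lines: each machine fits a lasso on its local split, debiases it with an approximate inverse covariance M, and a single communication round averages the debiased estimates. The sketch below is illustrative only, not the authors' implementation: the ISTA lasso solver, the ridge-regularized matrix inverse used for M, and all parameter values are assumptions (the paper proposes its own parallel algorithm for computing M).

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Lasso via proximal gradient (ISTA); a simple stand-in solver."""
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)  # 1/L for (1/2n)||y - Xb||^2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

def debiased_lasso(X, y, lam):
    """Debias: beta_d = beta + M X^T (y - X beta) / n, with M ~ Sigma^{-1}.

    Here M is a crudely regularized inverse of the sample covariance,
    used purely as a stand-in for the paper's approximate inverse.
    """
    n, p = X.shape
    beta = lasso_ista(X, y, lam)
    Sigma = X.T @ X / n
    M = np.linalg.inv(Sigma + 1e-3 * np.eye(p))
    return beta + M @ X.T @ (y - X @ beta) / n

rng = np.random.default_rng(0)
n_machines, n_per, p = 4, 200, 10
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]

# Each machine fits and debiases a lasso on its own sample split.
estimates = []
for _ in range(n_machines):
    X = rng.standard_normal((n_per, p))
    y = X @ beta_true + 0.5 * rng.standard_normal(n_per)
    estimates.append(debiased_lasso(X, y, lam=0.1))

# One round of communication: average the debiased local estimates.
beta_avg = np.mean(estimates, axis=0)
```

The averaged estimator is dense; in practice one would typically threshold it afterward to recover a sparse support.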


Similar Articles

Communication-efficient sparse regression: a one-shot approach

We devise a one-shot approach to distributed sparse regression in the high-dimensional setting. The key idea is to average “debiased” or “desparsified” lasso estimators. We show the approach converges at the same rate as the lasso as long as the dataset is not split across too many machines. We also extend the approach to generalized linear models.


Sparse Regression Codes

Developing computationally-efficient codes that approach the Shannon-theoretic limits for communication and compression has long been one of the major goals of information and coding theory. There have been significant advances towards this goal in the last couple of decades, with the emergence of turbo and sparse-graph codes in the ‘90s [1, 2], and more recently polar codes and spatially-coupl...


L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework

Despite the importance of sparsity in many big data applications, there are few existing methods for efficient distributed optimization of sparsely-regularized objectives. In this paper, we present a communication-efficient framework for L1-regularized optimization in distributed environments. By taking a non-traditional view of classical objectives as part of a more ge...


Robust Estimation in Linear Regression with Multicollinearity and Sparse Models

One of the factors affecting the statistical analysis of data is the presence of outliers. Methods that are not affected by outliers are called robust methods. Robust regression methods are robust estimation methods for regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...


CoCoA: A General Framework for Communication-Efficient Distributed Optimization

The scale of modern datasets necessitates the development of efficient distributed optimization methods for machine learning. We present a general-purpose framework for the distributed environment, CoCoA, that has an efficient communication scheme and is applicable to a wide variety of problems in machine learning and signal processing. We extend the framework to cover general non-strongly conv...




Journal:
  • Journal of Machine Learning Research

Volume 18, Issue -

Pages -

Published 2017